Searching for the earliest galaxies in the 21 cm forest
ABSTRACT

We use a model developed by Xu et al. (2010) to compute the 21 cm line absorption signatures imprinted by star-forming dwarf galaxies (DGs) and starless minihalos (MHs). The method, based on a statistical comparison of the equivalent width (W_ν) distribution and flux correlation function, allows us to derive a simple selection criterion for candidate DGs at very high (z > 8) redshift. We find that ~ 18% of the total number of DGs along a line of sight to a target radio source (GRB or quasar) can be identified by the condition W_ν < 0; these objects correspond to the high-mass tail of the DG distribution at high redshift, and are embedded in large HII regions. The criterion W_ν > 0.37 kHz instead selects ≈ 11% of MHs. Selected candidate DGs could later be re-observed in the near-IR by the JWST with high efficiency, thus providing a direct probe of the most likely reionization sources.



Subject headings: galaxies: dwarf - galaxies: high-redshift - intergalactic medium - cosmology: theory - radio lines: galaxies.

1. INTRODUCTION



The properties of the earliest galaxies, such as their star formation histories, masses, production of ionizing photons and their escape fraction, are crucial in understanding the reionization process, during which the previously neutral intergalactic medium (IGM) becomes totally ionized. Thanks to the availability of large ground-based and space telescopes, and improvements in the search techniques for Lyman Break Galaxies and Lyman-α Emitters, we are now tracing galaxy formation at progressively higher redshifts beyond 6 (see Bunker et al. 2009 for a review). Candidates at redshifts as high as z ~ 10 have recently been reported from the analysis of the Hubble Ultra Deep Field (HUDF) (Bouwens et al. 2009). However, it is now believed that the galaxies that produced most of the necessary (re)ionizing photons were dwarfs (Choudhury & Ferrara 2007), which are currently beyond our capability of direct detection. The forthcoming James Webb Space Telescope (JWST) will have the capabilities to directly detect the reionization sources at the faint end of the luminosity function. Still, given their faintness, long integration times will be required; hence, defining target candidate reionizing sources will be of primary importance to study them in spectroscopic detail.

Instead of looking at a specific galaxy directly, the redshifted 21 cm transition of HI traces the neutral gas either in the diffuse IGM or in non-linear structures, making it the most promising probe of the reionization process (see e.g. Furlanetto, Oh & Briggs 2006 for a review). While 21 cm tomography maps out the three-dimensional structure of the 21 cm emission or absorption by the IGM against the cosmic microwave background (CMB) (e.g. Madau et al. 1997; Tozzi et al. 2000), the 21 cm forest observation detects absorption lines of intervening structures towards high-redshift radio sources, showing high sensitivity to gas temperature (Xu et al. 2009). The problem of the 21 cm forest signatures produced by different kinds of structures has been addressed by several authors. Carilli et al. (2002) presented a detailed study of 21 cm absorption by the mean neutral IGM as well as filamentary structures based on the simulations of Gnedin (2000), but their box was too small to account for large scale structures and was not able to resolve collapsed objects. Instead, Furlanetto & Loeb (2002) used a simple analytic model to compute the absorption profiles and abundances of minihalos and galactic disks. Later on, Furlanetto (2006) re-examined four kinds of 21 cm forest absorption signatures in a broader context, especially the transmission gaps produced by ionized bubbles.

Recently, Xu et al. (2010) developed a more detailed model of the 21 cm absorption lines of minihalos (i.e. starless galaxies, MHs) and dwarf galaxies (star-forming galaxies, DGs) during the epoch of reionization, explored the physical origins of the line profiles, and generated synthetic spectra of the 21 cm forest on top of both high-z quasars and gamma-ray burst (GRB) afterglows. Interestingly, they find that (i) MHs and DGs show very distinct 21 cm line absorption profiles, and (ii) they contribute differently to the spectrum due to the mass segregation between the two populations. It follows that it is in principle possible to build a criterion based on the 21 cm forest spectrum to efficiently select DGs.

The goal of this work is to derive the different signatures of DGs and MHs using a 21 cm spectrum of high-z radio sources, and to provide a criterion to pick out DG lines in the spectrum. For these candidates, precise redshift information will be available; moreover, given the angular position of the background source, the 21 cm forest observation provides an excellent tool for locating the high-z DGs. The great advantage of using high-z GRBs as background radio sources is that the follow-up IR JWST observations, carried out after the afterglow has faded away, will not be hampered by the presence of a very luminous source (as in the case of a background quasar) in the field.^1



1 Throughout this paper, we adopt the cosmological parameters from WMAP5 measurements combined with SN and BAO data: Ω_b = 0.0462, Ω_c = 0.233, Ω_Λ = 0.721, H_0 = 70.1 km s^-1 Mpc^-1, σ_8 = 0.817, and n_s = 0.96


2. METHOD 

Here we briefly summarize the main features of the model used in this work, but refer the interested reader to Xu et al. (2010) for a more comprehensive description. We use the Sheth-Tormen halo mass function (Sheth & Tormen 1999) to model the halo number density at high redshift in the mass range 10^(5-10) M_⊙, which covers the minimum mass allowed to collapse (Abel et al. 2000; O'Shea & Norman 2007) and most of the galaxies that are responsible for reionization (Choudhury & Ferrara 2007). The dark matter halos have an NFW density profile within the virial radius r_vir (Navarro, Frenk & White 1997), with a concentration parameter fitted to high-redshift simulation results by Gao et al. (2005); the dark matter density and velocity structure outside r_vir are described by an "Infall model"^2 (Barkana 2004). The gas inside r_vir is assumed to be in hydrostatic equilibrium at temperature T_vir in the dark matter potential, while the gas outside follows the dark matter overdensity and velocity field.

1 (continued) (Komatsu et al. 2009).
2 Public code available at http://wise-obs.tau.ac.il/~barkana/codes.html

Once the halo population is fixed, a timescale criterion for star formation is introduced to determine whether a halo is capable of hosting star formation. The timescale required for turning on star formation is modeled as the maximum of the free-fall time t_ff and the H2 cooling time t_cool (Tegmark et al. 1997), i.e. t_SF = max{t_ff, t_cool}. Star formation activity then begins at t_s = t_F + t_SF, where t_F is the halo formation time predicted by the standard EPS model (Lacey & Cole 1993). If t_s is larger than the Hubble time at the halo redshift, we define the system as a minihalo, i.e. a starless galaxy. The ionized fraction in a MH is computed assuming collisional ionization equilibrium, which depends on its temperature. The gas within r_vir is at the virial temperature, and in the absence of an X-ray background the gas outside is adiabatically compressed, so that its temperature is simply T_K = T_IGM (1 + δ)^(γ-1), where γ = 5/3 is the adiabatic index for atomic hydrogen, and T_IGM is the temperature of the mean-density IGM. The Lyα photons inside a MH are produced by recombinations, which are negligible for most MHs that are almost neutral, but serve as a dominant coupling source for the most massive MHs, which are partially ionized due to their higher T_vir.
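The timescale criterion and the adiabatic temperature relation described above can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions, not the code used in this work: the function names are hypothetical, all times must simply share one unit, and a halo with t_s exactly equal to the Hubble time is arbitrarily assigned to the MH class.

```python
def classify_halo(t_form, t_ff, t_cool, t_hubble):
    """Label a halo 'DG' or 'MH' from the star formation timescale criterion.

    All four times must be given in the same (arbitrary) unit.
    """
    # Time needed to switch on star formation: the slower of the two processes.
    t_sf = max(t_ff, t_cool)
    # Epoch at which star formation would begin.
    t_s = t_form + t_sf
    # Assumption: the boundary case t_s == t_hubble is treated as a minihalo.
    return "DG" if t_s < t_hubble else "MH"


def adiabatic_temperature(t_igm, delta, gamma=5.0 / 3.0):
    """Kinetic temperature of adiabatically compressed gas at overdensity delta."""
    return t_igm * (1.0 + delta) ** (gamma - 1.0)
```

For example, `classify_halo(0.2, 0.05, 0.1, 0.5)` yields a DG because star formation would start at 0.3 in the chosen time unit, before the Hubble time of 0.5.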

When the criterion t_s < t_H is satisfied, where t_H is the Hubble time at the halo redshift, star formation occurs within a Hubble time, turning the halo into a dwarf galaxy. We use a mass-dependent fit to the star formation efficiency provided by Salvadori & Ferrara (2009). Adopting the spectra of high-redshift starburst galaxies provided by Schaerer (2002, 2003) and an escape fraction of f_esc = 0.07 as favored by the early reionization model (ERM, Gallerani et al. 2008), we numerically follow the expansion of the HII region. The gas temperature inside the HII region is fixed at 2 × 10^4 K, while the temperature of the gas around the HII region is calculated including the Hubble expansion, soft X-ray heating and Compton heating. Although soft X-ray heating dominates over Compton heating, its effect is weak unless the DG has a higher stellar mass and/or a top-heavy initial mass function. Besides ionization and heating effects, the DG metal-free stellar population produces Lyα photons from soft X-ray cascading, which can penetrate into the nearby IGM and help to couple the spin temperature to the kinetic temperature of the gas. Finally, we account for the Lyα background produced by the collective contribution of DGs.

With the detailed modeling of the properties of both MHs and DGs, and an associated Lyα background, we compute the 21 cm absorption lines of these non-linear structures. The diffuse IGM creates a global decrement in the spectra of high-z radio sources, on top of which MHs and DGs produce deep and narrow absorption lines. The 21 cm optical depth of an isolated object is (Field 1959; Madau et al. 1997; Furlanetto & Loeb 2002)

$$\tau(\nu) = \frac{3 h_{\rm P} c^3 A_{10}}{32 \pi^{3/2} k_{\rm B} \nu_{10}^2} \int {\rm d}x \, \frac{n_{\rm HI}(r)}{b(r)\, T_{\rm S}(r)} \exp\left[ -\frac{\left(u(\nu) - v(r)\right)^2}{b^2(r)} \right], \tag{1}$$

where A_10 = 2.85 × 10^-15 s^-1 is the Einstein coefficient for the spontaneous decay of the 21 cm transition, n_HI is the neutral hydrogen number density, T_S is the spin temperature, and b(r) is the Doppler parameter of the gas, b(r) = sqrt(2 k_B T_K(r) / m_H). Here u(ν) is the frequency difference from the line center in terms of velocity, u(ν) = c (ν − ν_10)/ν_10, where ν_10 = 1420.4 MHz is the rest-frame frequency of the 21 cm line, and v(r) is the bulk velocity of the gas projected onto the line of sight at radius r. Inside the virial radius, the gas is thermalized and v(r) = 0, while the gas outside the virial radius has a bulk velocity contributed by both infall and the Hubble flow, as predicted by the "Infall model". The coordinate x is related to the radius r by r^2 = (α r_vir)^2 + x^2, where α is the impact parameter of the penetrating line of sight in units of r_vir.
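To make the structure of the optical depth integral concrete, the following Python sketch evaluates it with a trapezoidal rule for gas properties tabulated along the line-of-sight coordinate x. It is an illustration only, not the implementation used here: the function names, the CGS units, the rounded physical constants and the tabulated-profile interface are all assumptions, and `nu` is taken to be the frequency in the absorber rest frame.

```python
import math

# Rounded physical constants in CGS units (assumed values for illustration).
H_P = 6.626e-27    # Planck constant [erg s]
C = 2.998e10       # speed of light [cm/s]
K_B = 1.381e-16    # Boltzmann constant [erg/K]
M_H = 1.673e-24    # hydrogen atom mass [g]
A_10 = 2.85e-15    # Einstein coefficient of the 21 cm transition [1/s]
NU_10 = 1420.4e6   # rest-frame frequency of the 21 cm line [Hz]


def doppler_parameter(t_k):
    """Doppler parameter b = sqrt(2 k_B T_K / m_H) in cm/s."""
    return math.sqrt(2.0 * K_B * t_k / M_H)


def velocity_offset(nu):
    """Offset from line center in velocity units, u = c (nu - nu_10) / nu_10."""
    return C * (nu - NU_10) / NU_10


def optical_depth(nu, x, n_hi, t_k, t_s, v_bulk):
    """Trapezoidal estimate of the 21 cm optical depth along one sightline.

    x [cm], n_hi [cm^-3], t_k [K], t_s [K] and v_bulk [cm/s] are equal-length
    sequences sampled along the line of sight.
    """
    prefactor = 3.0 * H_P * C**3 * A_10 / (32.0 * math.pi**1.5 * K_B * NU_10**2)
    u = velocity_offset(nu)
    integrand = []
    for n, tk, ts, v in zip(n_hi, t_k, t_s, v_bulk):
        b = doppler_parameter(tk)
        integrand.append(n / (b * ts) * math.exp(-((u - v) / b) ** 2))
    total = 0.0
    for i in range(len(x) - 1):
        total += 0.5 * (integrand[i] + integrand[i + 1]) * (x[i + 1] - x[i])
    return prefactor * total
```

For a uniform, static slab the integral reduces to the prefactor times n_HI L / (b T_S) at line center, which is a convenient check of any implementation.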

3 http://cdsarc.u-strasbg.fr/cgi-bin/Cat?VI/109

The spin temperature of neutral hydrogen is defined by the relative occupation numbers of the two hyperfine structure levels, and it is determined by (Field 1958; Furlanetto, Oh & Briggs 2006):

$$T_{\rm S}^{-1} = \frac{T_\gamma^{-1} + x_c T_{\rm K}^{-1} + x_\alpha T_{\rm C}^{-1}}{1 + x_c + x_\alpha}, \tag{2}$$

where T_γ = 2.726 (1 + z) K is the CMB temperature at redshift z, T_K is the gas kinetic temperature, and T_C is the effective color temperature of the UV radiation. In most cases, T_C = T_K due to the frequent Lyα scattering (Furlanetto, Oh & Briggs 2006). The collisional coupling is described by the coefficient x_c, and x_α is the coupling coefficient of the Lyα pumping effect known as the Wouthuysen-Field coupling (Wouthuysen 1952; Field 1958). The main contributions to x_c come from H-H collisions and H-e^- collisions, so that it can be written as x_c = x_c^{eH} + x_c^{HH} = (n_e κ_10^{eH} / A_10)(T_*/T_γ) + (n_HI κ_10^{HH} / A_10)(T_*/T_γ), where T_* = 0.0682 K is the equivalent temperature of the energy splitting of the 21 cm transition, and κ_10^{eH} and κ_10^{HH} are the de-excitation rate coefficients in collisions with free electrons and hydrogen atoms, respectively. The coupling coefficient x_α is proportional to the total scattering rate between Lyα photons and hydrogen atoms, x_α = (4 P_α / 27 A_10)(T_*/T_γ), where the scattering rate P_α is given by P_α = c σ_α n_α^tot / Δν_D = 4π σ_α J_α. Here σ_α = (π e^2 / m_e c) f_α, where f_α = 0.4162 is the oscillator strength of the Lyα transition, n_α^tot is the total number density of Lyα photons, J_α is the number intensity of the Lyα photons, and Δν_D = (b/c) ν_α is the Doppler width, with b being the Doppler parameter and ν_α the Lyα frequency.
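The way the coupling coefficients pull the spin temperature between the CMB temperature and the gas temperature can be seen from a short numerical sketch. The functions below are illustrative only; their names are hypothetical, and the coupling coefficients x_c and x_α are treated as given inputs rather than computed from the rate coefficients.

```python
def cmb_temperature(z):
    """CMB temperature in K at redshift z, T_gamma = 2.726 (1 + z)."""
    return 2.726 * (1.0 + z)


def spin_temperature(t_gamma, t_k, t_c, x_c, x_alpha):
    """Spin temperature from the weighted mean of inverse temperatures."""
    inverse_ts = (1.0 / t_gamma + x_c / t_k + x_alpha / t_c) / (1.0 + x_c + x_alpha)
    return 1.0 / inverse_ts
```

With both coefficients set to zero the spin temperature equals the CMB temperature, while for very strong coupling (and T_C = T_K) it approaches the kinetic temperature, which is the regime in which cold gas produces strong 21 cm absorption.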
Fig. 1. Relative transmission along a line of sight. Left: the spectrum with absorptions caused by DGs alone from 129 MHz to 158 MHz, corresponding to z = 10.01 - 7.99. Right: the spectrum with absorptions caused by MHs alone from 129 MHz to 130 MHz, corresponding to z = 10.01 - 9.93.
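The correspondence between the observed frequency ranges and the redshift ranges quoted in the caption follows from ν_obs = ν_10 / (1 + z) with ν_10 = 1420.4 MHz. A minimal Python sketch of the conversion is given below; the function names are hypothetical and chosen only for illustration.

```python
NU_10_MHZ = 1420.4  # rest-frame frequency of the 21 cm line [MHz]


def redshift_from_frequency(nu_obs_mhz):
    """Redshift at which the 21 cm line is observed at nu_obs_mhz."""
    return NU_10_MHZ / nu_obs_mhz - 1.0


def frequency_from_redshift(z):
    """Observed frequency in MHz of the 21 cm line emitted or absorbed at z."""
    return NU_10_MHZ / (1.0 + z)
```

Rounded to two decimals, 129, 130 and 158 MHz map to z = 10.01, 9.93 and 7.99, matching the ranges in the caption.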



3. RESULTS 

3.1. The spectra of dwarfs and minihalos 

With the halo number density predicted by the Sheth-Tormen mass function and the cross-sectional area of halos determined by the mean halo separation, we derive the number density of the absorption lines. Applying our star formation criterion to each intersected halo with Monte Carlo sampled mass and formation redshift, we compute the absorption lines of MHs in collisional ionization equilibrium or of DGs photoionized by central stars, and generate a synthetic spectrum along a line of sight. The entire spectrum with absorptions of both MHs and DGs is shown in Xu et al. (2010) against a quasar or a GRB afterglow. In order to illustrate the differences between the spectrum caused by MHs and that caused by DGs, and disregarding the background source properties, we plot the relative transmission T = exp(−τ) of a spectrum with absorptions caused by dwarf galaxies alone (DG-spectrum) in the left panel of Fig. 1, and that with absorptions caused by MHs alone (MH-spectrum) in the right panel. Note that the ranges of observed frequency are different in the two panels. The absorption lines are very narrow and closely spaced in the MH-spectrum, which resembles a 21 cm forest, while the 21 cm lines in the DG-spectrum are much rarer. A clear signature unique to the DG-spectrum is that some dwarf galaxies have sufficiently large HII regions that they give rise to leaks (i.e. negative absorption lines with equivalent width W_ν < 0) in the spectrum rather than absorption lines. Also, we see that the absorption lines of MHs are generally deeper than those of DGs.

We define δ(ν) to be the relative difference between the flux f at observed frequency ν and the global flux f_IGM transmitted through the homogeneous IGM at the corresponding redshift,

$$\delta(\nu) = \frac{f(\nu)}{f_{\rm IGM}} - 1. \tag{3}$$

Then we compute the flux correlation functions with the formula

$$\xi_{ab}(\Delta\nu) = \frac{1}{N} \sum_{i=1}^{N} \delta_a(\nu_i)\, \delta_b(\nu_i + \Delta\nu), \tag{4}$$

where N is the total number of point pairs on the spectrum with a frequency distance of Δν. The subscript "ab" takes the values "gg" ("hh") for the auto-correlation function of the DG (MH) spectrum, or "hg" for the cross-correlation between the MH and DG spectra.
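A pair-averaged estimator of this kind can be sketched in a few lines of Python. The sketch assumes that the two spectra are sampled on the same uniform frequency grid, that the lag is expressed as an integer number of channels, and that the correlation is the mean product of the two flux contrasts over all N available pairs; the function names are hypothetical and this is not the code used for Fig. 2.

```python
def flux_contrast(flux, flux_igm):
    """Relative flux difference with respect to the IGM-transmitted flux."""
    return [f / flux_igm - 1.0 for f in flux]


def flux_correlation(delta_a, delta_b, lag):
    """Mean product of delta_a[i] * delta_b[i + lag] over all available pairs.

    delta_a and delta_b must have the same length; lag is a non-negative
    number of channels, so the frequency distance is lag times the channel width.
    """
    n_pairs = len(delta_a) - lag
    if len(delta_a) != len(delta_b) or lag < 0 or n_pairs <= 0:
        raise ValueError("need equal-length spectra and 0 <= lag < length")
    total = sum(delta_a[i] * delta_b[i + lag] for i in range(n_pairs))
    return total / n_pairs
```

Passing the same contrast array twice gives an auto-correlation ("gg" or "hh"), while passing the MH and DG contrasts gives the cross-correlation ("hg").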

We plot these correlation functions with solid curves in Fig. 2. From the auto-correlation function of the DG-spectrum, we see little correlation at frequency separations larger than 10 kHz, while in the MH-spectrum little correlation is seen at frequency separations larger than 1 kHz. This is what we would expect for a randomly generated spectrum based only on the halo mass function. The correlation seen at smaller frequency separations simply comes from point pairs located within the same line. The DGs have a longer correlation length because of their broader absorption lines or leaks, but the amplitude of the correlation is very low because they are rare.

In order to illustrate the contribution of those dwarfs with negative absorption, we calculated the auto-correlation function of the DG-spectrum with leaks excluded; the result is shown as the dashed curve in the upper panel of Fig. 2. The correlation amplitude is smaller due to the reduced number of signals, and the correlation length is reduced by more than one order of magnitude. This shows that the broad signals are caused primarily by the leaks. In addition, comparing this dashed curve with the solid curve in the bottom panel, we find that on average an absorption line of a DG is even narrower than that of a MH. This is because some of these DGs produce HII regions larger than r_vir, and the absorption lines from the gas outside the virial radius but inside the HII region, which has the largest infall velocities, are erased.

The cross-correlation function between the 
MHs and DGs can also be obtained in the same 
way. In this work we have not considered the clus- 
tering property of the MHs and DGs arising from 
large scale structure, so the flux cross-correlation 
should be zero, except for the Poisson fluctuations. 
Our computation confirms this expectation. 

Fig. 2. The flux correlation functions of the 21 cm forest spectrum. Upper panel: the auto-correlation functions of the DG-spectrum. The solid curve includes all the lines while the dashed curve excludes the leaks with W_ν < 0. Bottom panel: the auto-correlation function of the MH-spectrum. The spectra used for the computation here are all from 129 MHz to 158 MHz, corresponding to z = 10.01 - 7.99. [Horizontal axis: Δν (kHz).]

3.2. Equivalent width distributions

Directly from the spectrum, we can compute the distribution of the equivalent width (EW) of the absorption lines for a specific range of observed frequency corresponding to a specific redshift. As the continuum of a background source has a global decrement due to the absorption by the diffuse IGM, the real signal of non-linear structures is the extra absorption with respect to the flux transmitted through the IGM. Therefore, the EW of an absorption line should be defined as

$$W_\nu = \int \frac{f_c\, e^{-\tau_{\rm IGM}(z)} - f_c\, e^{-\tau(\nu)}}{f_c\, e^{-\tau_{\rm IGM}(z)}}\, {\rm d}\nu = \int \left(1 - e^{\tau_{\rm IGM}(z) - \tau(\nu)}\right) {\rm d}\nu, \tag{5}$$

where f_c is the continuum flux of the background radio source, and τ_IGM(z) is the optical depth of the diffuse IGM at redshift z. We compute the differential and cumulative distributions of EW for the DG-spectrum and for the MH-spectrum, shown in the left and right panels of Fig. 3, respectively. The histograms represent the number distributions and the solid curves are the cumulative distributions per redshift interval.
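As an illustration of an EW measured relative to the IGM-absorbed continuum, the sketch below integrates 1 − exp[τ_IGM − τ(ν)] over a line with a trapezoidal rule. This integrand is an assumption consistent with the description above (extra absorption with respect to the flux transmitted through the IGM, with the continuum flux f_c cancelling out); the function name and the choice of kHz for the frequency grid are likewise hypothetical.

```python
import math


def equivalent_width(nu_khz, tau, tau_igm):
    """Trapezoidal EW in kHz of one line, relative to the IGM-absorbed continuum.

    nu_khz: increasing frequency grid across the line [kHz]
    tau:    total optical depth at each grid point
    tau_igm: optical depth of the diffuse IGM at the line redshift
    """
    integrand = [1.0 - math.exp(tau_igm - t) for t in tau]
    width = 0.0
    for i in range(len(nu_khz) - 1):
        width += 0.5 * (integrand[i] + integrand[i + 1]) * (nu_khz[i + 1] - nu_khz[i])
    return width
```

With this sign convention a region that is more transparent than the mean IGM (τ < τ_IGM, as inside a large HII region) contributes a negative EW, which is the "leak" case, whereas extra absorption gives a positive EW.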

We see that the EW distributions for both DGs and MHs peak in the same region, at W_ν ~ 0.1 - 0.3 kHz. This means that most of the DGs and MHs have comparable EWs, and we cannot distinguish them from their EWs alone. However, the two distributions have different shapes: the EW distribution of DGs has a long tail at the small-EW end, while that of MHs has a tail at the large-EW end.

3.3. Selection criteria 

In this subsection, we aim at deriving a criterion 
to distinguish between the absorption lines caused 
by DGs and those caused by MHs. 

From the computation of EW distributions, we find that only dwarf galaxies cause negative absorptions and thus have W_ν < 0, while only MHs are found to have EWs above 0.37 kHz. Therefore, the first criterion for candidate DGs is W_ν < 0, with which we select 10 dwarfs out of 54 in our synthetic spectrum. Lines selected in this way have a 100 percent probability of being caused by DGs. That is, we can find 18.5% of all the dwarfs along the line of sight, and they are relatively large dwarfs with large HII regions. In addition, with the predicted EW distribution, we can estimate the total number of DGs in the spectrum from the number of selected dwarf galaxies that have negative absorptions. Similarly, using the second criterion W_ν > 0.37 kHz, we can select 812 MHs out of a total of 7108. This is 11.4% of all the MHs, and these lines cannot be misidentified as DGs.
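The two EW cuts amount to a three-way labeling of each line, which the following Python sketch makes explicit. The function names and the "ambiguous" label are hypothetical; the thresholds and the counts in the usage note are those quoted in the text.

```python
def classify_line(w_khz, mh_threshold_khz=0.37):
    """Label a line from its equivalent width in kHz using the two EW cuts."""
    if w_khz < 0.0:
        return "DG"         # leak: only DGs with large HII regions
    if w_khz > mh_threshold_khz:
        return "MH"         # only MHs reach such large EWs
    return "ambiguous"      # the cuts alone cannot decide


def selected_percentage(n_selected, n_total):
    """Percentage of objects picked out by a cut."""
    return 100.0 * n_selected / n_total
```

With the counts of the synthetic spectrum, `selected_percentage(10, 54)` is about 18.5 and `selected_percentage(812, 7108)` is about 11.4, while both cuts together classify (10 + 812) of the (54 + 7108) lines, i.e. about 11.5%.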

From the correlation functions shown above, on the other hand, we see that DGs have a longer correlation length than MHs. In the absence of halo clustering information, the correlation length reflects the mean width of the absorption lines, so this is also a signature that distinguishes dwarfs from MHs. However, we have demonstrated that the correlation of dwarfs at relatively large frequency distances is caused precisely by those with negative absorptions. Therefore, the criterion of broad absorption is degenerate with the negative tail of the EW distribution of the DGs. Excluding those dwarfs with negative absorptions, the mean width of an absorption line of a DG (about 0.3 kHz) is even smaller than that of a MH (about 0.5 kHz). Hence, for the lines with 0 < W_ν < 0.37 kHz, if we had infinite resolution, a narrower absorption line would have a higher probability of being caused by a dwarf galaxy. This is probably beyond the resolution capabilities of currently planned instruments. Without width information, the probability that one of these absorption lines is caused by a DG would be ~ 44/6296 ≈ 0.7%, with the complementary probability attributed to MHs.
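The numbers behind this estimate can be reproduced from the counts quoted in the previous paragraphs. The short Python sketch below is illustrative only, with hypothetical function names; note that it normalizes by the total number of lines in the ambiguous band (44 + 6296), which differs negligibly from the ratio 44/6296 at this level of precision.

```python
def ambiguous_band_counts(n_dg_total=54, n_dg_leaks=10, n_mh_total=7108, n_mh_wide=812):
    """Numbers of DG and MH lines left in the band between the two EW cuts."""
    return n_dg_total - n_dg_leaks, n_mh_total - n_mh_wide


def dg_probability(n_dg, n_mh):
    """Probability that a randomly chosen line in the band is caused by a DG."""
    return n_dg / (n_dg + n_mh)
```

Calling `ambiguous_band_counts()` returns 44 DGs and 6296 MHs, and `dg_probability(44, 6296)` is close to 0.7%, so a line in this band is overwhelmingly likely to be a MH unless additional width information is available.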

4. DISCUSSION 



Using the model developed by Xu et al. (2010), we have computed 21 cm absorption line spectra ("forest") caused by DGs and MHs separately, together with their flux correlation functions and EW distributions, with the aim of distinguishing DGs from MHs in a statistical way. With the selection criterion W_ν < 0, we are able to identify ~ 18.5% of DGs, and the criterion W_ν > 0.37 kHz selects ~ 11.4% of MHs. As a whole, we can disentangle ~ 11.5% of all the non-linear objects along a line of sight, i.e. for these we can tell whether they are DGs or MHs. In this way, we find a strong but simple criterion to select candidate DGs to be later re-observed in the optical or infrared. Using the radio afterglow of a high-redshift GRB as the background, this selection strategy could be accomplished by LOFAR or SKA. Then, after the GRB fades away, the follow-up observations can be carried out by JWST, which will be capable of directly detecting the DGs that are responsible for reionization.

Cosmic voids can also produce negative absorptions with respect to the mean absorption by the IGM. Accounting for the density voids requires the clustering information of large scale structure, which is not included in our computation. However, according to the void size distribution based on the excursion set model developed by Sheth & Weygaert (2004), the characteristic scale of a density void is much larger than that of a DG HII region. As a result, density voids will produce "transmissivity windows" which are about one order of magnitude broader than the "leaks" produced by HII regions. As the widths of both "transmissivity windows" and "leaks" exceed the current spectral resolution, a second criterion based on signal width could be applied to eliminate those voids. Further, it is not necessary to consider the so-called "mixing problem" between the density voids and the HII regions as Shang et al. (2007) did for the Lyα forest, because the dwarfs are more likely to reside in filaments outside the voids, and are therefore unlikely to mix with the density voids.

Fig. 3. The differential and cumulative distributions of equivalent width of the 21 cm absorption lines. Left: the distribution for the DG-spectrum. Right: the distribution for the MH-spectrum. The spectra used for the computation here are both from 129 MHz to 158 MHz, corresponding to z = 10.01 - 7.99. [Horizontal axis: W_ν (kHz).]

While the selection criterion for candidate DGs is reliable, the total number of predicted DGs and the percentages of identifiable objects are model-dependent. Specifically, they depend on the star formation law and efficiency, the stellar initial mass function, and f_esc. However, the fraction of dwarfs having negative absorption depends only on the shape of the EW distribution. If different star formation models predict similar shapes of the EW distribution, then our prediction of the fraction of dwarfs producing leaks is quite reliable and model-independent, and the total number of DGs along a given line of sight can be safely inferred from the number of selected leaks. Otherwise, we could compare the total number of dwarfs inferred from the percentage argument with the one originally predicted from our star formation model, and use this result to constrain the model.
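The percentage argument can be written down explicitly: if a known fraction of DGs produces leaks, the total number of DGs follows from the number of detected leaks. The Python sketch below is a hedged illustration, not part of the analysis in this work; the function name is hypothetical, the default leak fraction of 10/54 is the value found in the synthetic spectrum, and the quoted uncertainty assumes simple Poisson noise on the leak count only, ignoring any uncertainty in the leak fraction itself.

```python
import math


def infer_total_dgs(n_leaks, leak_fraction=10.0 / 54.0):
    """Estimate the total number of DGs on a sightline from the leak count.

    Returns (estimate, one-sigma uncertainty); the uncertainty assumes Poisson
    noise on n_leaks and a perfectly known leak_fraction.
    """
    if not 0.0 < leak_fraction <= 1.0:
        raise ValueError("leak_fraction must lie in (0, 1]")
    estimate = n_leaks / leak_fraction
    sigma = math.sqrt(n_leaks) / leak_fraction
    return estimate, sigma
```

For the 10 leaks of the synthetic spectrum the estimate recovers the 54 DGs by construction, with a Poisson uncertainty of roughly 17, which indicates how many sightlines would be needed before such an inferred total could usefully constrain a star formation model.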

To improve on the current selection criteria, the next step is to include the clustering properties of dark matter halos. With this ingredient included, the correlation functions will retain additional information on the distances between the lines. In principle, knowing the shape of the correlation function, one could associate with any given line in the spectrum (e.g. by using Bayesian methods) the statistical probability that it arises from a DG. We defer these and other aspects to future work.